Noise Perturbation Improves Supervised Speech Separation
Authors
Abstract
Speech separation can be treated as a mask estimation problem where interference-dominant portions are masked in a time-frequency representation of noisy speech. In supervised speech separation, a classifier is typically trained on a mixture set of speech and noise. Improving the generalization of a classifier is challenging, especially when interfering noise is strong and nonstationary. Expansion of a noise through proper perturbation during training exposes the classifier to more noise variations, and hence may improve separation performance. In this study, we examine the effects of three noise perturbations at low signal-to-noise ratios (SNRs). We evaluate speech separation performance in terms of hit minus false-alarm rate and short-time objective intelligibility (STOI). The experimental results show that frequency perturbation performs the best among the three perturbations. In particular, we find that frequency perturbation reduces the error of misclassifying a noise pattern as a speech pattern.
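The hit minus false-alarm rate named in the abstract is commonly computed by comparing an estimated binary mask against an ideal binary mask: HIT is the share of speech-dominant time-frequency units that are correctly kept, and FA is the share of noise-dominant units that are wrongly kept. The following is a minimal sketch under that common definition, not the authors' evaluation code; the function name `hit_minus_fa` and the toy masks are hypothetical and chosen only for illustration.

```python
def hit_minus_fa(ideal_mask, estimated_mask):
    """Return (HIT, FA, HIT - FA) for two binary masks of equal shape.

    Masks are nested lists (frequency channels x time frames) of 0/1 values,
    where 1 marks a speech-dominant unit. This layout is an assumption.
    """
    # Flatten both masks so units can be compared pairwise.
    ideal = [bool(u) for row in ideal_mask for u in row]
    est = [bool(u) for row in estimated_mask for u in row]
    if len(ideal) != len(est):
        raise ValueError("masks must have the same number of units")

    n_speech = sum(ideal)              # units labeled 1 in the ideal mask
    n_noise = len(ideal) - n_speech    # units labeled 0 in the ideal mask

    # Hits: speech-dominant units that the estimate also labels 1.
    hits = sum(1 for i, e in zip(ideal, est) if i and e)
    # False alarms: noise-dominant units that the estimate labels 1.
    false_alarms = sum(1 for i, e in zip(ideal, est) if not i and e)

    hit_rate = hits / n_speech if n_speech else 0.0
    fa_rate = false_alarms / n_noise if n_noise else 0.0
    return hit_rate, fa_rate, hit_rate - fa_rate


# Toy example: 2 channels x 4 frames (made-up values, not data from the study).
ideal = [[1, 1, 0, 0],
         [1, 0, 0, 1]]
estimated = [[1, 0, 0, 1],
             [1, 0, 0, 1]]
hit, fa, score = hit_minus_fa(ideal, estimated)
print(hit, fa, score)  # 0.75 0.25 0.5
```

In this toy case one noise-dominant unit is kept by mistake, which is the kind of error (a noise pattern classified as speech) that raises FA and lowers the overall score.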
Similar resources
Noise perturbation for supervised speech separation
Speech separation can be treated as a mask estimation problem, where interference-dominant portions are masked in a time-frequency representation of noisy speech. In supervised speech separation, a classifier is typically trained on a mixture set of speech and noise. It is important to efficiently utilize limited training data to make the classifier generalize well. When target speech is severe...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...
Source-filter Based Clustering for Monaural Blind Source Separation
In monaural blind audio source separation scenarios, a signal mixture is usually separated into more signals than active sources. Therefore it is necessary to group the separated signals to the final source estimations. Traditionally grouping methods are supervised and thus need a learning step on appropriate training data. In contrast, we discuss unsupervised clustering of the separated channe...
Real-Time Speech Separation by Semi-supervised Nonnegative Matrix Factorization
In this paper, we present an on-line semi-supervised algorithm for real-time separation of speech and background noise. The proposed system is based on Nonnegative Matrix Factorization (NMF), where fixed speech bases are learned from training data whereas the noise components are estimated in real-time on the recent past. Experiments with spontaneous conversational speech and real-life nonstati...
An iterative model-based approach to cochannel speech separation
Cochannel speech separation aims to separate two speech signals from a single mixture. In a supervised scenario, the identities of two speakers are given, and current methods use pre-trained speaker models for separation. One issue in model-based methods is the mismatch between training and test signal levels. We propose an iterative algorithm to adapt speaker models to match the signal levels ...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2015